Overview

Dataset statistics

Number of variables: 8
Number of observations: 768
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 48.1 KiB
Average record size in memory: 64.2 B
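The memory figures can be cross-checked against the row and column counts. The sketch below is only a plausibility check: the 8 bytes per stored value and the fixed 128-byte index overhead are assumptions that happen to reproduce the reported numbers, not values stated in the report.

```python
# Assumptions (not stated in the report): every column stores 8 bytes per
# value, and the row index adds a fixed 128 bytes.
N_ROWS = 768
N_COLUMNS = 8
BYTES_PER_VALUE = 8   # assumed
INDEX_BYTES = 128     # assumed

# Memory of a single variable, counted together with the index.
column_bytes = N_ROWS * BYTES_PER_VALUE + INDEX_BYTES
# Memory of the whole table: all columns plus one index.
total_bytes = N_ROWS * BYTES_PER_VALUE * N_COLUMNS + INDEX_BYTES
# Average bytes per observation.
record_bytes = total_bytes / N_ROWS

per_variable_kib = f"{column_bytes / 1024:.1f} KiB"
total_kib = f"{total_bytes / 1024:.1f} KiB"
per_record = f"{record_bytes:.1f} B"
print(per_variable_kib, total_kib, per_record)
```

Under these assumptions the three strings agree with the per-variable memory size, the total size in memory, and the average record size shown in the report.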

Variable types

Numeric: 7
Categorical: 1

Alerts

Age is highly overall correlated with Pregnancies (High correlation)
BMI is highly overall correlated with SkinThickness (High correlation)
Pregnancies is highly overall correlated with Age (High correlation)
SkinThickness is highly overall correlated with BMI (High correlation)
Pregnancies has 111 (14.5%) zeros (Zeros)

Reproduction

Analysis started: 2024-03-06 02:30:27.559096
Analysis finished: 2024-03-06 02:30:42.371575
Duration: 14.81 seconds
Software version: ydata-profiling v4.6.5
Download configuration: config.json
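The reported duration is the difference between the two timestamps. A minimal sketch of that arithmetic, using only the values printed above:

```python
from datetime import datetime

# Timestamps copied from the Reproduction block.
started = datetime.fromisoformat("2024-03-06 02:30:27.559096")
finished = datetime.fromisoformat("2024-03-06 02:30:42.371575")

# Elapsed wall-clock time in seconds.
duration_seconds = (finished - started).total_seconds()
print(f"Duration: {duration_seconds:.2f} seconds")
```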

Variables

Pregnancies
Real number (ℝ)

Alerts: HIGH CORRELATION, ZEROS

Distinct: 9
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.0963542
Minimum: 0
Maximum: 8
Zeros: 111
Zeros (%): 14.5%
Negative: 0
Negative (%): 0.0%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
Median: 3
Q3: 5
95-th percentile: 7
Maximum: 8
Range: 8
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.3663409
Coefficient of variation (CV): 0.76423459
Kurtosis: -0.87335958
Mean: 3.0963542
Median Absolute Deviation (MAD): 2
Skewness: 0.44944067
Sum: 2378
Variance: 5.5995695
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=9); image not preserved]
Common values (most frequent first)
Value  Count  Frequency (%)
1      140    18.2%
2      113    14.7%
0      111    14.5%
3      93     12.1%
4      89     11.6%
5      76     9.9%
6      63     8.2%
7      45     5.9%
8      38     4.9%

Extreme values (smallest first)
Value  Count  Frequency (%)
0      111    14.5%
1      140    18.2%
2      113    14.7%
3      93     12.1%
4      89     11.6%
5      76     9.9%
6      63     8.2%
7      45     5.9%
8      38     4.9%

Extreme values (largest first)
Value  Count  Frequency (%)
8      38     4.9%
7      45     5.9%
6      63     8.2%
5      76     9.9%
4      89     11.6%
3      93     12.1%
2      113    14.7%
1      140    18.2%
0      111    14.5%
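
Because Pregnancies has only nine distinct values, its value counts determine the summary statistics completely. The sketch below rebuilds the column from the counts and recomputes several of the reported figures with the Python standard library. Two choices are assumptions, made because they reproduce the reported numbers: the sample variance (n - 1 denominator) and the "inclusive" quantile method (linear interpolation over the sorted data).

```python
import statistics

# Value -> count, copied from the Pregnancies value counts.
counts = {0: 111, 1: 140, 2: 113, 3: 93, 4: 89, 5: 76, 6: 63, 7: 45, 8: 38}

n = sum(counts.values())
# Expand the counts back into a sorted list of observations.
values = [v for v, c in sorted(counts.items()) for _ in range(c)]

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (assumed convention)
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
zero_share = 100 * counts[0] / n        # percentage of zeros

print(n, total, round(mean, 7), round(variance, 7), q1, median, q3)
```

The recomputed total, mean, variance and quartiles can be compared directly with the Sum, Mean, Variance, Q1, median and Q3 entries listed for this variable.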

Glucose
Real number (ℝ)

Distinct: 136
Distinct (%): 17.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 121.68678
Minimum: 44
Maximum: 199
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 44
5-th percentile: 80
Q1: 99.75
Median: 117
Q3: 140.25
95-th percentile: 181
Maximum: 199
Range: 155
Interquartile range (IQR): 40.5

Descriptive statistics

Standard deviation: 30.435949
Coefficient of variation (CV): 0.25011713
Kurtosis: -0.25916009
Mean: 121.68678
Median Absolute Deviation (MAD): 20
Skewness: 0.53271658
Sum: 93455.45
Variance: 926.34698
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50); image not preserved]
Common values (most frequent first)
Value               Count  Frequency (%)
99                  17     2.2%
100                 17     2.2%
111                 14     1.8%
129                 14     1.8%
125                 14     1.8%
106                 14     1.8%
112                 13     1.7%
108                 13     1.7%
95                  13     1.7%
105                 13     1.7%
Other values (126)  626    81.5%

Extreme values (smallest first)
Value  Count  Frequency (%)
44     1      0.1%
56     1      0.1%
57     2      0.3%
61     1      0.1%
62     1      0.1%
65     1      0.1%
67     1      0.1%
68     3      0.4%
71     4      0.5%
72     1      0.1%

Extreme values (largest first)
Value  Count  Frequency (%)
199    1      0.1%
198    1      0.1%
197    4      0.5%
196    3      0.4%
195    2      0.3%
194    3      0.4%
193    2      0.3%
191    1      0.1%
190    1      0.1%
189    4      0.5%
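
The "Other values" row in the Glucose common-values listing is a remainder: it covers every distinct value outside the ten shown. A small sketch of how that row follows from the distinct count, the number of observations, and the ten listed counts:

```python
N_OBSERVATIONS = 768
N_DISTINCT = 136  # distinct Glucose values, from the summary

# Ten most frequent Glucose values and their counts, copied from the listing.
top_counts = {99: 17, 100: 17, 111: 14, 129: 14, 125: 14,
              106: 14, 112: 13, 108: 13, 95: 13, 105: 13}

# Distinct values and observations not covered by the ten listed rows.
other_distinct = N_DISTINCT - len(top_counts)
other_count = N_OBSERVATIONS - sum(top_counts.values())
other_share = 100 * other_count / N_OBSERVATIONS

print(f"Other values ({other_distinct}): {other_count} ({other_share:.1f}%)")
```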

BloodPressure
Real number (ℝ)

Distinct: 44
Distinct (%): 5.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 72.40776
Minimum: 36.12
Maximum: 108.69
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 36.12
5-th percentile: 52
Q1: 64
Median: 72.205
Q3: 80
95-th percentile: 90
Maximum: 108.69
Range: 72.57
Interquartile range (IQR): 16

Descriptive statistics

Standard deviation: 11.88779
Coefficient of variation (CV): 0.16417839
Kurtosis: 0.53011105
Mean: 72.40776
Median Absolute Deviation (MAD): 7.795
Skewness: 0.15031486
Sum: 55609.16
Variance: 141.31955
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=44); image not preserved]
Common values (most frequent first)
Value              Count  Frequency (%)
70                 57     7.4%
74                 52     6.8%
68                 45     5.9%
78                 45     5.9%
72                 44     5.7%
64                 43     5.6%
80                 40     5.2%
76                 39     5.1%
60                 37     4.8%
72.41              35     4.6%
Other values (34)  331    43.1%

Extreme values (smallest first)
Value  Count  Frequency (%)
36.12  3      0.4%
38     1      0.1%
40     1      0.1%
44     4      0.5%
46     2      0.3%
48     5      0.7%
50     13     1.7%
52     11     1.4%
54     11     1.4%
55     2      0.3%

Extreme values (largest first)
Value   Count  Frequency (%)
108.69  5      0.7%
108     2      0.3%
106     3      0.4%
104     2      0.3%
102     1      0.1%
100     3      0.4%
98      3      0.4%
96      4      0.5%
95      1      0.1%
94      6      0.8%

SkinThickness
Real number (ℝ)

Alerts: HIGH CORRELATION

Distinct: 48
Distinct (%): 6.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 29.079648
Minimum: 7
Maximum: 55.53
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 7
5-th percentile: 14.35
Q1: 25
Median: 29.15
Q3: 32
95-th percentile: 44
Maximum: 55.53
Range: 48.53
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation: 8.4183786
Coefficient of variation (CV): 0.28949382
Kurtosis: 0.57815168
Mean: 29.079648
Median Absolute Deviation (MAD): 3.85
Skewness: 0.18516259
Sum: 22333.17
Variance: 70.869099
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=48); image not preserved]
Common values (most frequent first)
Value              Count  Frequency (%)
29.15              227    29.6%
32                 31     4.0%
30                 27     3.5%
27                 23     3.0%
23                 22     2.9%
28                 20     2.6%
33                 20     2.6%
18                 20     2.6%
31                 19     2.5%
19                 18     2.3%
Other values (38)  341    44.4%

Extreme values (smallest first)
Value  Count  Frequency (%)
7      2      0.3%
8      2      0.3%
10     5      0.7%
11     6      0.8%
12     7      0.9%
13     11     1.4%
14     6      0.8%
15     14     1.8%
16     6      0.8%
17     14     1.8%

Extreme values (largest first)
Value  Count  Frequency (%)
55.53  4      0.5%
54     2      0.3%
52     2      0.3%
51     1      0.1%
50     3      0.4%
49     3      0.4%
48     4      0.5%
47     4      0.5%
46     8      1.0%
45     6      0.8%

Insulin
Real number (ℝ)

Distinct: 170
Distinct (%): 22.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 152.30376
Minimum: 14
Maximum: 410.61
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 14
5-th percentile: 50
Q1: 121.5
Median: 155.55
Q3: 155.55
95-th percentile: 293
Maximum: 410.61
Range: 396.61
Interquartile range (IQR): 34.05
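
The Range and IQR entries are plain differences of the quantiles listed for Insulin. The sketch below recomputes them with `decimal.Decimal` so that the two-decimal figures compare exactly, and it also makes explicit that the median and Q3 coincide, which is why the IQR is narrow relative to the range.

```python
from decimal import Decimal

# Quantile figures copied from the Insulin quantile statistics.
minimum = Decimal("14")
q1 = Decimal("121.5")
median = Decimal("155.55")
q3 = Decimal("155.55")
maximum = Decimal("410.61")

value_range = maximum - minimum  # Range = Maximum - Minimum
iqr = q3 - q1                    # IQR = Q3 - Q1
median_equals_q3 = median == q3  # upper half of the box collapses to a point

print(value_range, iqr, median_equals_q3)
```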

Descriptive statistics

Standard deviation: 69.673124
Coefficient of variation (CV): 0.4574616
Kurtosis: 4.2308259
Mean: 152.30376
Median Absolute Deviation (MAD): 3.5
Skewness: 1.5199555
Sum: 116969.29
Variance: 4854.3442
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50); image not preserved]
Common values (most frequent first)
Value               Count  Frequency (%)
155.55              374    48.7%
410.61              19     2.5%
105                 11     1.4%
140                 9      1.2%
130                 9      1.2%
120                 8      1.0%
100                 7      0.9%
180                 7      0.9%
94                  7      0.9%
135                 6      0.8%
Other values (160)  311    40.5%

Extreme values (smallest first)
Value  Count  Frequency (%)
14     1      0.1%
15     1      0.1%
16     1      0.1%
18     2      0.3%
22     1      0.1%
23     2      0.3%
25     1      0.1%
29     1      0.1%
32     1      0.1%
36     3      0.4%

Extreme values (largest first)
Value   Count  Frequency (%)
410.61  19     2.5%
402     1      0.1%
392     1      0.1%
387     1      0.1%
375     1      0.1%
370     1      0.1%
360     1      0.1%
342     1      0.1%
335     1      0.1%
330     1      0.1%

BMI
Real number (ℝ)

Alerts: HIGH CORRELATION

Distinct: 244
Distinct (%): 31.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 32.422865
Minimum: 18.2
Maximum: 53.08
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 18.2
5-th percentile: 22.235
Q1: 27.5
Median: 32.4
Q3: 36.6
95-th percentile: 44.395
Maximum: 53.08
Range: 34.88
Interquartile range (IQR): 9.1

Descriptive statistics

Standard deviation: 6.7453475
Coefficient of variation (CV): 0.20804292
Kurtosis: 0.041751338
Mean: 32.422865
Median Absolute Deviation (MAD): 4.6
Skewness: 0.42318956
Sum: 24900.76
Variance: 45.499713
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50); image not preserved]
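
Several of the descriptive statistics for BMI are tied together by simple identities: the mean is the sum divided by the number of observations, the variance is the squared standard deviation, and the coefficient of variation is the standard deviation divided by the mean. A brief sketch using the figures reported for this variable (inputs are rounded, so the results agree only to within rounding):

```python
N_OBSERVATIONS = 768

# Figures copied from the BMI descriptive statistics.
bmi_sum = 24900.76
bmi_std = 6.7453475

bmi_mean = bmi_sum / N_OBSERVATIONS  # Mean = Sum / n
bmi_variance = bmi_std ** 2          # Variance = (Standard deviation)^2
bmi_cv = bmi_std / bmi_mean          # CV = Standard deviation / Mean

print(round(bmi_mean, 6), round(bmi_variance, 4), round(bmi_cv, 6))
```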
Common values (most frequent first)
Value               Count  Frequency (%)
32                  13     1.7%
31.6                12     1.6%
31.2                12     1.6%
32.46               11     1.4%
32.4                10     1.3%
33.3                10     1.3%
30.1                9      1.2%
30.8                9      1.2%
32.8                9      1.2%
32.9                9      1.2%
Other values (234)  664    86.5%

Extreme values (smallest first)
Value  Count  Frequency (%)
18.2   3      0.4%
18.4   1      0.1%
19.1   1      0.1%
19.3   1      0.1%
19.4   1      0.1%
19.5   2      0.3%
19.6   3      0.4%
19.9   1      0.1%
20     1      0.1%
20.1   1      0.1%

Extreme values (largest first)
Value  Count  Frequency (%)
53.08  5      0.7%
52.9   1      0.1%
52.3   2      0.3%
50     1      0.1%
49.7   1      0.1%
49.6   1      0.1%
49.3   1      0.1%
48.8   1      0.1%
48.3   1      0.1%
47.9   2      0.3%

Age
Real number (ℝ)

Alerts: HIGH CORRELATION

Distinct: 49
Distinct (%): 6.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 33.216927
Minimum: 21
Maximum: 68.52
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 21
5-th percentile: 21
Q1: 24
Median: 29
Q3: 41
95-th percentile: 58
Maximum: 68.52
Range: 47.52
Interquartile range (IQR): 17

Descriptive statistics

Standard deviation: 11.678506
Coefficient of variation (CV): 0.35158297
Kurtosis: 0.41569322
Mean: 33.216927
Median Absolute Deviation (MAD): 7
Skewness: 1.0868219
Sum: 25510.6
Variance: 136.3875
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=49); image not preserved]
Common values (most frequent first)
Value              Count  Frequency (%)
22                 72     9.4%
21                 63     8.2%
25                 48     6.2%
24                 46     6.0%
23                 38     4.9%
28                 35     4.6%
26                 33     4.3%
27                 32     4.2%
29                 29     3.8%
31                 24     3.1%
Other values (39)  348    45.3%

Extreme values (smallest first)
Value  Count  Frequency (%)
21     63     8.2%
22     72     9.4%
23     38     4.9%
24     46     6.0%
25     48     6.2%
26     33     4.3%
27     32     4.2%
28     35     4.6%
29     29     3.8%
30     21     2.7%

Extreme values (largest first)
Value  Count  Frequency (%)
68.52  5      0.7%
68     1      0.1%
67     3      0.4%
66     4      0.5%
65     3      0.4%
64     1      0.1%
63     4      0.5%
62     4      0.5%
61     2      0.3%
60     5      0.7%

Outcome
Categorical

Distinct: 2
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 6.1 KiB

Value  Count
0      500
1      268

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 768
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 0
3rd row: 1
4th row: 0
5th row: 1

Common Values

Value  Count  Frequency (%)
0      500    65.1%
1      268    34.9%
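
The frequency percentages for Outcome are the class counts divided by the number of observations. A short sketch that reproduces them and also derives the ratio between the two classes (the ratio is not part of the report and is included only as an illustration):

```python
# Class counts copied from the Outcome common values.
class_counts = {"0": 500, "1": 268}

n_obs = sum(class_counts.values())
# Share of each class, in percent, rounded to one decimal as in the report.
class_shares = {label: round(100 * c / n_obs, 1) for label, c in class_counts.items()}
# Majority-to-minority ratio (illustrative, not reported).
imbalance_ratio = class_counts["0"] / class_counts["1"]

print(n_obs, class_shares, round(imbalance_ratio, 2))
```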

Length

[Figure: histogram of lengths of the category; image not preserved]

Common Values (Plot)

[Figure: plot of common values; image not preserved]

Most occurring characters

Value  Count  Frequency (%)
0      500    65.1%
1      268    34.9%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  768    100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      500    65.1%
1      268    34.9%

Most occurring scripts

Value   Count  Frequency (%)
Common  768    100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      500    65.1%
1      268    34.9%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  768    100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      500    65.1%
1      268    34.9%

Interactions

[Figures: 49 pairwise interaction plots; images not preserved]

Correlations

[Figure: correlation plot; image not preserved]

               Age     BMI     BloodPressure  Glucose  Insulin  Outcome  Pregnancies  SkinThickness
Age            1.000   0.120   0.364          0.281    0.199    0.337    0.540        0.184
BMI            0.120   1.000   0.290          0.225    0.172    0.311    -0.028       0.545
BloodPressure  0.364   0.290   1.000          0.242    0.107    0.149    0.147        0.204
Glucose        0.281   0.225   0.242          1.000    0.402    0.479    0.113        0.186
Insulin        0.199   0.172   0.107          0.402    1.000    0.246    0.117        0.190
Outcome        0.337   0.311   0.149          0.479    0.246    1.000    0.156        0.214
Pregnancies    0.540   -0.028  0.147          0.113    0.117    0.156    1.000        0.080
SkinThickness  0.184   0.545   0.204          0.186    0.190    0.214    0.080        1.000
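
The high-correlation alerts in the Overview can be related to this matrix by scanning its upper triangle for large coefficients. The sketch below does that with a cutoff of 0.5. The cutoff is an assumption chosen for illustration; the report itself does not state which threshold triggered the alerts.

```python
from itertools import combinations

names = ["Age", "BMI", "BloodPressure", "Glucose",
         "Insulin", "Outcome", "Pregnancies", "SkinThickness"]

# Correlation coefficients copied from the matrix, in the same row/column order.
matrix = [
    [1.000, 0.120, 0.364, 0.281, 0.199, 0.337, 0.540, 0.184],
    [0.120, 1.000, 0.290, 0.225, 0.172, 0.311, -0.028, 0.545],
    [0.364, 0.290, 1.000, 0.242, 0.107, 0.149, 0.147, 0.204],
    [0.281, 0.225, 0.242, 1.000, 0.402, 0.479, 0.113, 0.186],
    [0.199, 0.172, 0.107, 0.402, 1.000, 0.246, 0.117, 0.190],
    [0.337, 0.311, 0.149, 0.479, 0.246, 1.000, 0.156, 0.214],
    [0.540, -0.028, 0.147, 0.113, 0.117, 0.156, 1.000, 0.080],
    [0.184, 0.545, 0.204, 0.186, 0.190, 0.214, 0.080, 1.000],
]

THRESHOLD = 0.5  # assumed cutoff, not taken from the report

# Visit each unordered pair once (upper triangle) and keep the strong ones.
high_pairs = [
    (a, b, matrix[i][j])
    for (i, a), (j, b) in combinations(enumerate(names), 2)
    if abs(matrix[i][j]) >= THRESHOLD
]

for a, b, r in high_pairs:
    print(f"{a} / {b}: {r:.3f}")
```

With this cutoff the scan returns the Age / Pregnancies and BMI / SkinThickness pairs, the same two pairs named in the alerts; the next largest coefficient, Glucose / Outcome at 0.479, falls below it.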

Missing values

[Figure not preserved] A simple visualization of nullity by column.
[Figure not preserved] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

     Pregnancies  Glucose  BloodPressure  SkinThickness  Insulin  BMI    Age   Outcome
0    6            148.0    72.00          35.00          155.55   33.60  50.0  1
1    1            85.0     66.00          29.00          155.55   26.60  31.0  0
2    8            183.0    64.00          29.15          155.55   23.30  32.0  1
3    1            89.0     66.00          23.00          94.00    28.10  21.0  0
4    0            137.0    40.00          35.00          168.00   43.10  33.0  1
5    5            116.0    74.00          29.15          155.55   25.60  30.0  0
6    3            78.0     50.00          32.00          88.00    31.00  26.0  1
7    4            115.0    72.41          29.15          155.55   35.30  29.0  0
8    2            197.0    70.00          45.00          410.61   30.50  53.0  1
9    8            125.0    96.00          29.15          155.55   32.46  54.0  1

Last rows

     Pregnancies  Glucose  BloodPressure  SkinThickness  Insulin  BMI    Age   Outcome
758  1            106.0    76.0           29.15          155.55   37.5   26.0  0
759  6            190.0    92.0           29.15          155.55   35.5   66.0  1
760  2            88.0     58.0           26.00          16.00    28.4   22.0  0
761  4            170.0    74.0           31.00          155.55   44.0   43.0  1
762  5            89.0     62.0           29.15          155.55   22.5   33.0  0
763  6            101.0    76.0           48.00          180.00   32.9   63.0  0
764  2            122.0    70.0           27.00          155.55   36.8   27.0  0
765  5            121.0    72.0           23.00          112.00   26.2   30.0  0
766  1            126.0    60.0           29.15          155.55   30.1   47.0  1
767  1            93.0     70.0           31.00          155.55   30.4   23.0  0